Unsupervised Learning of the Morphology of a Natural Language
Author
Abstract
This study reports the results of using minimum description length (MDL) analysis to model unsupervised learning of the morphological segmentation of European languages, using corpora ranging in size from 5,000 words to 500,000 words. We develop a set of heuristics that rapidly develop a probabilistic morphological grammar, and use MDL as our primary tool to determine whether the modifications proposed by the heuristics will be adopted or not. The resulting grammar matches well the analysis that would be developed by a human morphologist. In the final section, we discuss the relationship of this style of MDL grammatical analysis to the notion of evaluation metric in early generative grammar.
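The accept-or-reject role that MDL plays for a proposed segmentation can be illustrated with a toy sketch. This is not the coding scheme of the paper: the two-part cost (letters needed to spell the morpheme lists, plus bits needed to encode the tokens), the fixed per-letter cost, the independent stem and suffix distributions, and all function names are assumptions made only for illustration.

```python
import math
from collections import Counter

# Assumption: 26 letters plus one end-of-morpheme marker, all equally likely.
BITS_PER_LETTER = math.log2(27)


def lexicon_cost(morphemes):
    """Bits needed to spell out each distinct morpheme once."""
    return sum((len(m) + 1) * BITS_PER_LETTER for m in morphemes)


def data_cost(counts):
    """Bits needed to encode every token under its empirical probability."""
    total = sum(counts.values())
    return -sum(c * math.log2(c / total) for c in counts.values())


def description_length(corpus, split):
    """Model cost plus data cost of a corpus under one segmentation.

    corpus: Counter mapping word -> frequency
    split:  dict mapping word -> (stem, suffix); unlisted words stay whole
    """
    stems, suffixes = Counter(), Counter()
    for word, freq in corpus.items():
        stem, suffix = split.get(word, (word, ""))
        stems[stem] += freq
        suffixes[suffix] += freq
    model = lexicon_cost(stems) + lexicon_cost(suffixes)
    data = data_cost(stems) + data_cost(suffixes)
    return model + data


def accept(corpus, current, proposed):
    """Adopt the proposed segmentation only if it shortens the description."""
    return description_length(corpus, proposed) < description_length(corpus, current)


if __name__ == "__main__":
    words = ["jump", "jumps", "jumped", "jumping",
             "walk", "walks", "walked", "walking"]
    corpus = Counter(words)
    proposal = {w: (w[:4], w[4:]) for w in words}  # hypothetical heuristic output
    print(accept(corpus, {}, proposal))
```

In this sketch, a split that lets several words share stems and suffixes shortens the lexicon enough to pay for the extra suffix list, while a split of an isolated word only adds entries and is rejected.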
Similar resources
An algorithm for the unsupervised learning of morphology
This paper describes in detail an algorithm for the unsupervised learning of natural language morphology, with emphasis on challenges that are encountered in languages typologically similar to European languages. It utilizes the Minimum Description Length analysis described in Goldsmith 2001 and has been implemented in software that is available for downloading and testing. 1. Scope of this pap...
Natural Language Processing Of Morphology With Linguistically Motivated Applications To German Linking Elements
A survey of the history of the learning of morphological rules is presented. Further investigation is made into the current state of NLP techniques with regard to supervised and unsupervised learning of morphology. An analysis of the outstanding problem of “German linking elements” is presented and reviewed. Finally, a proposal is made with the goal of applying current morphological analysis and ...
Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing natural language answers using computational methods and machine learning algorithms. The development of large-scale smart education systems on the one hand, and the importance of assessment as a key factor in the learning process together with the challenges it faces on the other, have significantly increased the need for ...
Unsupervised Morphological Relatedness
Assessment of the similarities between texts has been studied for decades from different perspectives and for several purposes. One interesting perspective is morphology. This article reports the results of a study on the assessment of the morphological relatedness between natural language words. The main idea is to adapt a formal string alignment algorithm, namely Needleman-Wunsch’s, to acco...
Experiments in Unsupervised Learning of Natural Language
Linguistics has invented and discarded many theories of language, and there are currently many competitors to the basic idea of phrase structure grammars as capturing the syntactic structure of language. Computational Linguistics has proven to be a testing ground for theories and grammars, and is similarly diverse. Moreover, we have recently learnt that similar principles and techniques may ...
Journal title:
- Computational Linguistics
Volume 27
Publication date: 2001